# Coded Residual Transform for Generalizable Deep Metric Learning

This code is mainly for reproducing the results reported in our paper submitted to NeurIPS, [Coded Residual Transform for Generalizable Deep Metric Learning]().

<img src="img/overview.jpg" width="100%" height="65%">

### Abstract
A fundamental challenge in deep metric learning is the generalization capability of the feature embedding network, since the embedding network learned on training classes needs to be evaluated on new test classes. To address this challenge, we introduce a new method called coded residual transform (CRT) for deep metric learning that significantly improves its generalization capability. Specifically, we learn a set of diversified anchor features, project the feature map onto each anchor, and then encode its features using their projection residuals weighted by their correlation coefficients with each anchor. The proposed CRT method has two unique characteristics. First, it represents and encodes the feature map from a set of complementary perspectives based on projections onto diversified anchors. Second, unlike existing transformer-based feature representation approaches, which encode the original values of features based on global correlation analysis, the proposed coded residual transform encodes the relative differences between the original features and their projected anchors. Embedding space density and spectral decay analysis show that this multi-perspective projection onto diversified anchors and the coded residual representation achieve significantly improved generalization capability in metric learning. Finally, to further enhance the generalization performance, we propose to enforce consistency between the feature similarity matrices of coded residual transforms with different sizes of projection anchors and embedding dimensions. Our extensive experimental results and ablation studies demonstrate that the proposed CRT method outperforms the state-of-the-art deep metric learning methods by large margins, improving upon the current best method by up to 4.28% on the CUB dataset.
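
The encoding step described in the abstract can be illustrated with a small sketch. This is not the implementation shipped in this repository: it assumes that the correlation coefficient of a feature with each anchor is a softmax over dot products, and that the residual is the plain difference between the feature and the anchor. The function names (`softmax`, `dot`, `coded_residual_sketch`) are hypothetical, and the projection and normalization used in the paper may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def coded_residual_sketch(features, anchors):
    """Encode N feature vectors against K anchors (all of dimension D).

    Assumption: weights are a softmax over feature-anchor dot products.
    Returns K vectors, one per anchor, each the weighted sum of the
    residuals (feature - anchor) over all features.
    """
    dim = len(anchors[0])
    encoded = [[0.0] * dim for _ in anchors]
    for x in features:
        weights = softmax([dot(x, a) for a in anchors])
        for k, (anchor, w) in enumerate(zip(anchors, weights)):
            for d in range(dim):
                encoded[k][d] += w * (x[d] - anchor[d])
    return encoded


if __name__ == "__main__":
    # Two toy local features and two toy anchors in a 2-D space.
    feats = [[1.0, 0.0], [0.0, 1.0]]
    anchs = [[1.0, 0.0], [0.0, 1.0]]
    for k, code in enumerate(coded_residual_sketch(feats, anchs)):
        print(k, code)
```

In this sketch, a feature that coincides with an anchor contributes a zero residual to that anchor, so each code captures only how the features deviate from the anchor rather than their original values.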
 
### Prepare the data and the pretrained model 

The following script prepares the [CUB](http://www.vision.caltech.edu.s3-us-west-2.amazonaws.com/visipedia-data/CUB-200-2011/CUB_200_2011.tgz) dataset for training: it downloads the data to the ./resource/datasets/ folder and then builds the data lists (train.txt and val.txt or test.txt):

```bash
./scripts/prepare_cub.sh
```

To reproduce the results of our paper, download the ImageNet-pretrained models and put them in the folder ~/.cache/torch/checkpoints/.
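
Before training, it can help to confirm that the pretrained weights are where the scripts expect them. The snippet below is only a convenience sketch, not part of this repository: the helper name `list_checkpoints` is hypothetical, and it makes no assumption about the checkpoint file names and simply lists whatever files are in the folder.

```python
from pathlib import Path


def list_checkpoints(folder="~/.cache/torch/checkpoints"):
    """Return the sorted file names found in the checkpoint folder.

    Returns an empty list if the folder does not exist.
    """
    root = Path(folder).expanduser()
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    names = list_checkpoints()
    if names:
        print("Found pretrained files:", ", ".join(names))
    else:
        print("No files found in ~/.cache/torch/checkpoints/")
```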


### Installation

```bash
sudo pip3 install -r requirements.txt
sudo python3 setup.py develop build
```

###  Train and Test on the CUB-200-2011 dataset with the CRT method based on the mit_b1 backbone

```bash
./scripts/run_cub_mitb1.sh
```
Trained models will be saved in the ./output-mitb1-cub/ folder if using the default config.

###  Train and Test on the CUB-200-2011 dataset with the CRT method based on the mit_b2 backbone

```bash
./scripts/run_cub_mitb2.sh
```
Trained models will be saved in the ./output-mitb2-cub/ folder if using the default config.

###  Train and Test on the CUB-200-2011 dataset with the CRT method based on the BN-Inception backbone

```bash
./scripts/run_cub_bninception.sh
```
Trained models will be saved in the ./output-bninception-cub/ folder if using the default config.

###  Train and Test on the CUB-200-2011 dataset with the CRT method based on the ResNet50 backbone

```bash
./scripts/run_cub_resnet50.sh
```
Trained models will be saved in the ./output-resnet50-cub/ folder if using the default config.

###  Train and Test on the CUB-200-2011 dataset with the CRT method based on the GoogleNet backbone

```bash
./scripts/run_cub_googlenet.sh
```
Trained models will be saved in the ./output-googlenet-cub/ folder if using the default config.

### Citation

If you use this method or this code in your research, please cite as:

    @inproceedings{CRT-NeurIPS2022,
    title={Coded Residual Transform for Generalizable Deep Metric Learning},
    author={Anonymous submitted paper to the NeurIPS 2022},
    booktitle={},
    pages={},
    year={2022}
    }

### License
This code is released for academic research / non-commercial use only.